Prosodic Word Boundaries Prediction for Mandarin Text-to-Speech

Authors

  • YanQiu Shao
  • JiQing Han
  • Ting Liu
  • YongZhen Zhao
Abstract

In Mandarin speech, the Prosodic Word (PW), rather than the Lexical Word (LW), is the basic rhythmic unit, and the segmentation of PWs directly affects the naturalness of TTS. Most PWs are combinations of several LWs. In this paper, three models are designed to combine lexical words into prosodic words: a directed acyclic graph (DAG) model, a segmentation model, and a Markov Model (MM) combined with the Transformation-Based Error-Driven (TBED) learning algorithm. Because some long LWs should be broken into two or more PWs, a long word break model is also applied to those LWs. Experimental results show that MM combined with TBED plus the long word break model performs best among the three methods, achieving 93.00% precision and 93.23% recall.
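
The abstract reports precision and recall but does not say how they are scored. As a rough illustration only, the sketch below assumes the scores are computed over PW boundary positions: precision is the share of predicted boundaries that are correct, and recall is the share of reference boundaries that are found. The function names and the placeholder strings are hypothetical and do not come from the paper.

```python
def boundary_positions(units):
    """Return the character offsets where a unit ends, excluding the sentence end."""
    positions = set()
    offset = 0
    for unit in units[:-1]:
        offset += len(unit)
        positions.add(offset)
    return positions


def precision_recall(predicted, reference):
    """Score predicted prosodic-word boundaries against reference boundaries.

    Assumption: scoring is boundary-level; the paper may define it differently.
    """
    pred = boundary_positions(predicted)
    ref = boundary_positions(reference)
    correct = len(pred & ref)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(ref) if ref else 0.0
    return precision, recall


# Placeholder "syllables" A..G stand in for Mandarin characters.
reference = ["AB", "CDE", "FG"]   # reference boundaries after offsets 2 and 5
predicted = ["AB", "CD", "EFG"]   # predicted boundaries after offsets 2 and 4
print(precision_recall(predicted, reference))
```

In this toy case one of the two predicted boundaries matches the reference, so both scores come out to 0.5.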


Similar resources

RNN-based prosodic modeling for mandarin speech and its application to speech-to-text conversion

In this paper, a recurrent neural network (RNN) based prosodic modeling method for Mandarin speech-to-text conversion is proposed. The prosodic modeling is performed in the post-processing stage of acoustic decoding and aims at detecting word-boundary cues to assist in linguistic decoding. It employs a simple three-layer RNN to learn the relationship between input prosodic features, extracted f...


Prosodic Word: the Lowest Constituent in the Mandarin Prosody Processing

This paper proposes a novel method that uses the prosodic word as the lowest constituent in prosody processing, in order to solve the prosody problem of a Mandarin concatenative speech synthesizer based on a large corpus. The results, obtained by applying the new solution to intonational prominence placement and break boundary assignment in text-to-speech systems, are positive and encour...


Locating Boundaries for Prosodic Constituents in Unrestricted Mandarin Texts

This paper proposes a three-tier prosodic hierarchy, including prosodic word, intermediate phrase and intonational phrase tiers, for Mandarin that emphasizes the use of the prosodic word instead of the lexical word as the basic prosodic unit. Both the surface difference and perceptual difference show that this is helpful for achieving high naturalness in text-to-speech conversion. Three approac...


The Parsody System: Automatic Prediction Of Prosodic Boundaries For Text-To-Speech

Modern text-to-speech (TTS) systems are quite good at word level synthesis, but tend to perform badly on connected word sequences. It has been suggested that the poor prosody of synthetic connected speech is the primary factor leading to difficulties in comprehension [1,5]. TTS systems must therefore incorporate better mechanisms for prosodic processing. For the purpose of this article, prosodi...


Mandarin Text-to-speech Synthesis

This chapter introduces Mandarin Text-To-Speech (MTTS) synthesis. Beginning with a brief review on the development history of MTTS and attributes of MTTS, three main constituents of the technology are presented: 1) Text processing: word segmentation, disambiguation of polyphones, and analysis of rhythm structure; 2) prosodic processing: features of Mandarin prosody, and prosody prediction, and;...



Publication date: 2004